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Abstract 

This note proposes a method for pricing high-dimensional American 
options based on modern methods of muhidimensional interpolation. The 
method aUows using sparse grids and thus mitigates the curse of dimen- 
sionahty. A framework of the pricing algorithm and the corresponding 
interpolation methods are discussed, and a theorem is demonstrated that 
suggests that the pricing method is less vulnerable to the curse of dimen- 
sionality. The method is illustrated by an application to rainbow options 
and compared to Least Squares Monte Carlo and other benchmarks. 

The fundamental problem of options theory is the valuation of 
hybrid, non-linear securities, and options theory is an ingenious but 
glorified method of interpolation. 

Emanuel Derman "A guide for the perplexed quant" 



1 Introduction 

Lattice option pricing^ is very popular among practitioners because in one- 
dimensional situations it is straightforward to implement and has a transparent 
interpretation. In multiple dimensions it has two serious drawbacks: the need 
to build recombining trees and the curse of dimensionality. If branches of the 
tree do not recombine then the number of nodes grows exponentially with the 
number of time steps. Similarly, the number of points in a regular grid grows 
exponentially with the space dimension. For example, to approximate every 
point in the nine-dimensional hypercube with 10% precision we need one billion 
points. With twenty factors, the argument runs, experiments with IBM's fastest 
supercomputer will quickly convince us that the lattice method of pricing is 
impractical with many dimensions.^ 

'Cornerstone Research, 599 Lexington Avenue floor 43, New York, NY 10022; skar- 

guine@cornersto ne .com 

^Invented by |Cox, Ross, and Rubinstein (1979)| 

■^The speed of this supercomputer is around 10^^ operations per second, so 
an optimistic estimate of the execution time is 10^ seconds or about 100 days. 



This argument is usually well taken but essentially wrong. First, trees may 
very well have non-recombining branches and a moderate number of nodes. We 
only need to leave some nodes without descendants and to interpolate values on 
these nodes. Second, in the case of many dimensions, we can use irregular grids 
with a moderate number of points provided that we can interpolate the value 
function faithfully from its values on the irregular grid. 

A suspicion may arise that this just shifts the computational burden to 
the interpolation problem and that the curse of dimensionality will remain as 
dangerous as it was before. Indeed, traditional approximation theory says that 
accurate approximation of a general function needs a number of grid points that 
is exponential in the space dimension. 

Does this invalidate the idea of pricing by interpolation? No, because we 
deal with specific classes of functions which may be approximated better than 
an arbitrarily chosen function. As the simplest example, consider linear func- 
tions. They can be recovered from values on only d + 1 points in d-dimensional 
space. Recall also the classical Kotelnikov-Shannon sampling theorem that says 
that a function with limited bandwidth of its Fourier transform can be com- 
pletely recovered from the values it takes on a discrete sample of points. These 
examples suggest that for certain classes of functions the problem of approxima- 
tion may be satisfactorily solved even in multiple dimensions. Indeed, this hope 
is being realized in recent and continuing work on non-linear multidimensional 
approximation. This paper aims to apply these new ideas to option pricing. 

The essential idea is to choose the approximating functions adaptively. First, 
the researcher chooses a large class of functions that are easy to compute and 
that are suitable to the problem at hand. Second, the researcher allows the data 
to select the functions that are most fitting as a basis of approximation. As a 
result, the approximation is well adapted to the properties of the given function. 
The main practical and theoretical tasks are automating this procedure and 
exploring its convergence properties. 

The idea of using approximations for option pricing is not entirely new. Re- 



cently, it was implemented by Longstaff and Schwartz (2001) in their "simple 
but powerful" Least Squares Monte Carlo algorithm. Their method is based on 
the Monte Carlo pricing method and they assume that the researcher can guess 



good basis functions for approximations. Tsitsiklis and Roy (2001) describe a 
similar method and analyze its convergence properties. The cardinal distinction 
of the method described in this paper is that it suggests the adaptive choice of 
the approximating basis. This adaptive way of approximation allows construc- 
tion of an algorithm which is universally applicable to a wide range of possible 
options. 

As an additional benefit, the approximation provided by the new algorithm 
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allows easy pricing everywhere in factor space. In contrast, both the standard 
lattice and Monte Carlo methods produce option values for only one combination 
of factors. This benefit is especially useful if there is a need to visualize the 
dependence of option price on factors, or to compute hedge factors - sensitivities 
of the option value with respect to changes in factors. 

The remainder of the paper is organized as follows. Section 2 briefly reviews 
the problem of option pricing and relates it to the problem of approximation. 
Section 3 outlines the framework of the algorithm, illustrates it with a simple 
example, and briefly describes why adaptive approximation is likely to break 
the curse of dimensionality. Section 4 explaines how the approximation can be 
used to find lower and upper bounds on the option value. Section 5 applies the 
method to a set of benchmark options and compare the results with the results 
in the literature. Section 6 concludes. 



2 Option Pricing and Interpolation 

The problem is as follows. Let the price of a derivative security depend on 
N factors that follow a specified diffusion process. The derivative is of the 
American type and so can be exercised at any time. Assume also that the 
price of the derivative is not path-dependent. Then by the dynamic replication 
argument, the value of the derivative satisfies the familiar^ partial differential 
equation : 

i i,k 

Here f{x, t) is the value of the derivative at time t if the vector of factors is x. 

One way to solve this equation is to write it in an integral form using a certain 
measure /i over the space of Brownian motion paths. This is the Feynman-Kac 
representation^ of PDE solution as an integral from a functional of Brownian 
motion paths: 

/(x,0)=sup/ ^^{x{T),T)e-'''d^I{x{t)). (2) 

T Jx{t)(EW 

Here r is a stopping time, tt(x{t),t) is the payoff at time r if the factors are 
x{t), and W is the space of paths of the Wiener process. 

We can write the Wiener measure /i as a limit over a sequence of time 
discretizations with Gaussian transition probabilities. Then, each of the discrete 
time problems can be solved recursively through the Bellman equation that 
relates the current option value to the values at the next time stage: 

fix, t) = max l^n{x{t),t); ^ ^ f{y, t + M)e-''^'dji{x, y, t)| . (3) 

'^see for example , [Hull (1999)]or|Wilmott and Howson (1995)1 
5 see |Kac (1949)] or |Karatzas and Shreve (1991)| 
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Here is a time interval, djl{x,y,t) is the probability of transition from x to 
y consistent with the Wiener measure d/i, and X is the space of factors. 

The next step - crucial for our analysis - is to discretize the equation over 
space. This means choosing a grid G d X and a suitable approximation 
for djl. Here is where difficulties begin. The standard tree methods use a 
regular - usually cubic - lattice, and specify probabilities of transitions from 
each lattice point to nearby lattice points so as to match the covariance matrix 
of the continuous process. 

In multidimensional situations the number of points in regular lattices is 
prohibitively large. Therefore we have to use an irregular grid with large gaps. 
Transitions over such a grid are unlikely to approach the Wiener process uni- 
formly.^ So what to do? One solution is to use spatial interpolation as in the 
following formula: 



fix, t) = max J 7:{xit), t), [ /(y, t + Ai)e-'-^*d/i(a;, y,t)\ . (4) 
( Jyev{x) J 

Here x belongs to grid G and y to which is a set of descendants of point 

X, which may very well lie outside of the grid. The function f{y,t + At) is 
interpolated from the values of / on the grid points that were obtained in the 
previous step of recursion. Measure dp. approximates d^. 

In summary, the main idea of the method is to separate two different uses of 
the grid, which usually plays a crucial role in both approximating the evolution 
of the stochastic process and in keeping information about the option value 
function. We suggest using the grid only for the latter purpose and simulating 
the stochastic process by computing small clusters of descendant points around 
each grid point. The values on the descendant points are interpolated from 
values on the grid points. Clearly, the success of this idea crucially depends 
on the quality of the spatial interpolation. We will discuss modern methods of 
multidimensional interpolation immediately after presenting the outline of the 
algorithm and an example. 



3 Outline of Algorithm 

Here is the general outline of the algorithm: 

1) Generate G, a grid ~ possibly irregular - in the factor space. 

2) For each point x (z G, initialize the value function by computing the payoff 
at the final stage T. 

3) Begin recursion over t < T: Compute an approximation to the value 
function at stage t using an approximation at stage t + 1. 

This step can be realized in different ways. Since one part of this step is 
applying the backward integral operator that corresponds to the factor process, 

^See, however, [Berridge and Schumacher (2004)|for encouraging advances in this direction. 
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the algorithm becomes more precise if it can be computed analytically. If not, 
we can always proceed as follows: 

a) For each point x e G, compute a set of the states that follow x 
in a discrete approximation to the factor process. 

b) For each y e 'D(x), compute f(y, t + dt) by interpolating. 

c) Compute the continuation value function at point x as the discounted 
average of /(y, t + dt) over 'D{x). 

d) Compute the new value function by taking the maximum of the 
continuation value function and the exercise payoff. 

e) Proceed to the next step of the recursion. 

The success of this method depends on the quality of approximations to 
the continuation value fucntion. In multiple dimensions the most successful 
approach to functional approximation to date is by fixing an over-complete 
set of basis functions and then looking for an approximation recursively. The 
typical realization of this idea is by the relaxed greedy algorithm (RGA) , which 
proceeds by forming a convex combination of a ridge function 4>{hn+ix + c„-|_i) 
and the previous approximation fn{x): 

fn+l{x) = a(j)[bn+lX + Cn+l) + (1 - a.)fn{x), (5) 

and then estimating bn+i, Cn+i, and a. Here (j){t) is a function of onc-dimcnsional 
parameter, and hn+ix means the scalar product of vectors fo„+i and x. 

Barron (1993) advocates using a sigmoidal function consistent with the 
neural network literature, Jones (1992)| uses sinusoidal functions, and |Breiman (1993) 



suggests using a connected pair of half-hyperplanes. The specifics of our appli- 
cation call for a different choice. Since the approximation serves only as a 
tool for solving a partial differential equation, the preferable choice seems to be 
Gaussian functions that can be easily propagated backward by the action of the 
backward integral operator associated with the factor process. 

A question immediately arises: In which circumstances can recursive meth- 
ods find efficient approximations by Gaussians? Below is a theorem that answers 



this question by generalizing a theorem from Jones (1992) 
Let 

OO 

f{x) = '^aicj){x;Bi,Ci) +u{x), (6) 

i=l 

where (j^ix; B, C) is a multi-dimensional Gaussian with parameters B (precision, 
i.e. inverse of covariance matrix) and C (shift of the center). 
Assume that 

1) j:\a,] = K<^, 

2) M^l, (7) 

3) ll"ll<e, 

where the norm is the L^-norm. 

Denote functions ±K(j){x; B,C) that enter the expansion © a,s (p^. Then 
/ is their linear convex combination. Let us define the approximation at stage 
n -|- 1 as a convex combination fn+i = (1 ^ '^)./n + '^4' of fhs previous stage 
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approximation and one of the functions 4>^ . We will choose the approximation 
that solves the following problem 



inf 

AG[O,l],0e{0J 



{||(l-A)/„ + A0-/f } 



(8) 



Theorem 1 For each a G (0, 1) and n > 1, the approximant fn is guaranteed 
to satisfy the following inequality: 



\fn-f\\ <max' 



[{K+l)/af 



(9) 



Remark: By taking infinum over a we can obtain the following two coroUar 



les: 



Corollary 2 lfe = 0, then el < {K + Xfjn 
Corollary 3 If e > and n is sufficiently large, then 



2^2 {K+1) 



ei < 



(10) 



Proof of Theorem: Let /„ be the approximation obtained at step n of the 
recursive algorithm. Consider the next recursive step. First, we can write: 



(l-A)(/„-/) + A(,^-/)|f 



(11) 



+2A(1-A) ifn'f,^-f) + X^U-fr 



Next, using assumption l(33) and the Cauchy-Schwarz inequality, we obtain the 
following inequality: 



/„-/,5Z««<^.-/ < ll/n -/II 



(12) 



where ai are coefficients of a convex linear combination. Therefore, 

oo 

X! (/« - f,(t^i- f) < ^ne 

i=l 

From the positivity of ai it follows that we can find such a (jj^ that 

ifn - /, 0, - /) < £«£■ 



(13) 



In addition, assumptions Ql) and lI7|2) imply 110-/11^ < (||0|| + ||/||)^ < 
{K + 1)^ for any ip. Consequently, 1111) implies 



^ inf 11(1- A)(/„-/) + A(0 -/)||^ 
\e[o,i],4> 

< il-\)^el + 2X{l-X)een + X\K+lf 



(14) 
(15) 
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Choose 



A = 



{K + 1)^ - ££„ 

(A" + 1)2 + 4 _ 2ee„ 



(16) 



This is a vahd choice of A provided that e„ > e. For this choice we have the 
following bound for the error of the next step: 



~-n+l 



< e. 



"(i^ + l)2 + 4-2ee„- 



Therefore, 



-n+l 



> el 



> £l 



{sn - e) 



1 



1 



(i^+l)2 

Consequently, if e„ > e/ (1 ^ «), then we have a sequence of inequalities: 

2 



s. 



n+l 



> 



> 



K + 1 



K + 1 



Summing them up we get 



'n+l 



-n+l 



K+1 



> (n+ 1) 



K + 1 



< 



[iK + l)/a\ 



(17) 

(18) 
(19) 

(20) 
(21) 

(22) 

(23) 

(24) 

< 



Therefore, either £„ < e/ (1 — a) and then e„+i < e/ (1 ~ a) , or 
{n+l)-^{{K + l)/af. 
QED. 

The significance of Theorem ^ is that it shows that a large class of functions 
can be approximated to a high precision S by an expansion that has 0{S~'^) 
terms. In particular, the rate of growth of the number of terms is independent 
of the space dimension. Moreover, the theorem shows that the approximation 
can be found by the recursive optimization method. 

In practice the parameters of the expansion must be estimated from the 
values of the function on a discrete grid. This introduces an additional error in 
the approximation - the estimation error. The extent of this error is not analysed 



in this paper. However, Theorem and results in Niyogi and Girosi (1999) 



suggest that the number of the gridpoints needed to bring the approximation 
error below a certain threshold grows only polynomially with dimension. 
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4 Lower and Upper Bounds 



In practical applications, we are often interested not only in an estimate but also 
in the firm bounds on the option value. Fortunately, the Monte Carlo method 
and interpolations from the lattice method can be combined for the efficient 
calculation of these bounds. 

Indeed, let V{x,t) denote the approximation to the continuation value func- 
tion. Define the following stopping rule r: "Stop if V{x, t) < tt{x, ty . It simply 
tells the holder of the option to stop when the approximate continuation value 
of the option is smaller than the exercise value. Then the lower bound on the 
option value at time is given by 



(25) 



The expectation can be easily computed using Monte Carlo simulations of pos- 
sible factor paths. 

The upper bound can be computed using the duality method developed in 



Rogers (2002) and Haugh and Kogan (2004) Let e ^*Mt be a supermartingale. 
Then 

V = snpEQe-''^n{xr,T) (26) 

r 

= sup {£;oe"''^^ [nixr, r) - AU] + Eoe'''^ Mr] (27) 

r 

< sup Eoe^"'' [K{xr, t) ~ Mr] + Mo (because Eoe'''^ Mr < Mo) (28) 

r 

< Eo sup e"''* [T:{xt,t) - Mt] + Mq (29) 

t 

Consequently, if we manage to find such a supermartingale that the expectation 
of e"*"^ [^{xr, t) — Mr] is uniformly small, then we can calculate a good upper 
bound on the option price. 

Intuitively, supermartingale M represents a replicating portfolio that the 
writer of the option constructs to hedge his position. The portfolio should be 
designed in such a way that it covers or almost covers the funds needed in the 
case of the option exercise. The possible deficit in funds is measured by the 
difference ■K{xt, t) — Mt, and the value of the option cannot be larger than the 
sum of the replicating portfolio value Mq and the discounted expectation of the 
supremum of the deficit. 

The main question is how to find a good supermartingale M. One pos- 
sible way is to choose the martingale distillation of the process V{xt,t) = 

; |v^(a;t, t), 7r(xt, i)| . In this case we define 



t+i 



e-'-'Mt + EtV(xt+ut + 1) - V{xut). 



(30) 



The expectation in this expression can be computed numerically. The choice 
will provide a good upper bound provided that the approximation V is good 
approximation to the option continuation value. 
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5 Application 



The main goal of this section is to show that the method described above is a 
viable alternative to approximate Monte Carlo methods. It demonstrates this 
by computing values of rainbow options. These securities have the following 
payoff structure: 

7r = max(/(X,5i,...,5„),0). (31) 



For example, the put option on the minimum of several assets has 

f = K - min(5i, S'„). 



(32) 



This set of securities is a convenient benchmark for testing a pricing method 
because it was extensively studied in the literature. For the case of European 
puts on the minimum or the maximum there are analytic formulas derived by 
Stulz (1982) and Johnson (1987) The American put on the minimum of two 



assets is priced in Boyle (1988) by a variant of the multinomial lattice method. 
Boyle, Evnine, and Gibbs (1989) give the results of a lattice method for three- 



dimensional European puts. Broadie and Glasserman (1997a)[[Broadie and Glasserman (1997b)[ 
Raymar and Zwecher (1997), B roadie, Glasserman, and Gain (1997)| and |Longstaff and Schwartz (2001) 
use two- and five-dimensional options as a benchmark for their Monte Carlo sim- 
ulation methods. 

The most essential choice in our algorithm was how many gridpoints to use. 
The grid was constructed by generating Sobol's quasi-Monte Carlo sequences 
in a hypercube, and then transfoming them by an appropriate Gaussian distri- 
bution. The number of grid points was determined using the cross-validation 
method.^ Namely, the grid was divided into two portions: training and vali- 
dation sets. The approximation was found using the training set only, and its 
quality was evaluated on the validation set by computing the mean squared error 
(MSE) . The new terms in the approximative expansion were added only if they 
decreased the MSE criterion. The lowest value of the MSE was taken as the 
performance measure for a given number of data points. If it was unsatisfactory, 
the number of grid points was increased and the procedure repeated. 

Table 1 summarizes the results of the application of the Interpolative Lattice 
(IL) method to European and American puts on the minimum of two assets. 
The results are compared to the results of the analytical formula and to the 
results of the lattice method from Boyle (1988) This table shows that IL gives 



high precision results for European options where the difference is on average 
less than a cent. For American puts the results of the IL and Boyle methods 
are slightly different but still are very close to each other. 

Table 2 shows the results of IL application to valuation of various put op- 



tions on three assets. The parameters are as in Boyle, Evnine, and Gibbs (1989) 



but since the results in Boyle, Evnine, and Gibbs (1989) apparently contain a 
miscalculation, the results obtained by the Monte Carlo method are used as a 



^See |Hastie, Tibshirani, and Friedman (2001)| for a fuller description of the cross-validation 
method. 
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benchmark.^ The results for European options are generally in good agreement 
with the Monte Carlo results, with the difference less than 5% of the option 
value. 

Table 3 shows the results of pricing American puts on various functions of 
five asset prices. The results are compared with results by the Least Squares 
Monte Carlo, binomial and Berridge-Schumacher methods as they are reported 
in Berridge and Schumacher (2004) The results are in good agreement. For the 



put on the geometric average of the five asset prices, the dimension reduction is 
possible and we are able to compute the exact price. In this case , the interpola- 
tive lattice method slightly overestimates the true price but the difference is less 
than 3%. The interpolative lattice method gives an estimate that in three cases 
exceeds and in one case falls below the estimate given by Least Squares Monte 
Carlo. This evidence suggests that the interpolative lattice method tends to 
give an estimate that is biased upward. 

The pricing times of the interpolative lattice method are rather long in the 
case of five assets - around 2 hours. Most of the time goes into the search for a 
good adaptive approximation. 

In summary, it appears that pricing by adaptive interpolation is a viable 
alternative to other methods of pricing multidimensional options including Least 
Squares Monte Carlo. 



6 Conclusion 

This paper proposes a novel variant of the lattice option pricing, which is 
based on modern methods of adaptive interpolation. In multiple dimensions 
the method allows using irregular grids and thus avoids both the curse of di- 
mensionality and the necessity to build recombining trees. The method is easy 
to apply to many examples of derivative securities and its practical viability is 
corroborated by numerical examples. 

A theorem is demonstrated that suggests that the new method is less likely 
to be vulnerable to the curse of dimensionality. Much further research is needed, 
however, to determine the convergence properties of the new method. 
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